-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[MachineLICM] Do not rely on hasSideEffect when handling impdefs #145379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-aarch64 Author: Anatoly Trosinenko (atrosinenko) ChangesTake clobbered registers described as implicit-def operands into When hasSideEffect was set to 0 in the definition of LOADgotAUTH,
like the following:
Full diff: https://github.com/llvm/llvm-project/pull/145379.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index c9079170ca575..5841a9ffa7a99 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -559,18 +559,15 @@ void MachineLICMImpl::ProcessMI(MachineInstr *MI, BitVector &RUDefs,
if (!MO.isDead())
// Non-dead implicit def? This cannot be hoisted.
RuledOut = true;
- // No need to check if a dead implicit def is also defined by
- // another instruction.
- continue;
+ } else {
+ // FIXME: For now, avoid instructions with multiple defs, unless
+ // it's a dead implicit def.
+ if (Def)
+ RuledOut = true;
+ else
+ Def = Reg;
}
- // FIXME: For now, avoid instructions with multiple defs, unless
- // it's a dead implicit def.
- if (Def)
- RuledOut = true;
- else
- Def = Reg;
-
// If we have already seen another instruction that defines the same
// register, then this is not safe. Two defs is indicated by setting a
// PhysRegClobbers bit.
diff --git a/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir b/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir
new file mode 100644
index 0000000000000..51cb35116dc62
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir
@@ -0,0 +1,53 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64 -run-pass machinelicm -verify-machineinstrs -o - %s | FileCheck %s
+# RUN: llc -mtriple=aarch64 -passes machinelicm -o - %s | FileCheck %s
+
+---
+name: test
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: test
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x80000000)
+ ; CHECK-NEXT: liveins: $x0
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $x16 = COPY killed $x0
+ ; CHECK-NEXT: B %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; CHECK-NEXT: liveins: $x16
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $x1 = COPY killed $x16
+ ; CHECK-NEXT: $x2 = MOVi64imm 1024, implicit-def dead $x16
+ ; CHECK-NEXT: $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+ ; CHECK-NEXT: $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+ ; CHECK-NEXT: Bcc 1, %bb.1, implicit $nzcv
+ ; CHECK-NEXT: B %bb.2
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: liveins: $x1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: $x0 = COPY killed $x1
+ ; CHECK-NEXT: RET_ReallyLR
+ bb.0:
+ liveins: $x0
+ $x16 = COPY killed $x0
+ B %bb.1
+
+ bb.1:
+ liveins: $x16
+ $x1 = COPY killed $x16
+ /* MOVi64imm below mimics a pseudo instruction that doesn't have any */
+ /* unmodelled side effects, but uses x16 as a scratch register. */
+ $x2 = MOVi64imm 1024, implicit-def dead $x16
+ $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+ $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+ Bcc 1, %bb.1, implicit $nzcv
+ B %bb.2
+
+ bb.2:
+ liveins: $x1
+ $x0 = COPY killed $x1
+ RET_ReallyLR
+...
|
05c3da9
to
079d654
Compare
Fixed the test failures: the original fix ruled out safe-to-move instructions with |
Take clobbered registers described as implicit-def operands into account when deciding if an instruction can be moved. When hasSideEffect was set to 0 in the definition of LOADgotAUTH, MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started to crash. The issue was traced down to MachineLICM pass placing LOADgotAUTH right after an unrelated copy to x16 like rewriting this code: bb.0: renamable $x16 = COPY renamable $x12 B %bb.1 bb.1: ... /* use $x16 */ ... renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv /* use $x20 */ ... like the following: bb.0: renamable $x16 = COPY renamable $x12 renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv B %bb.1 bb.1: ... /* use $x16 */ ... /* use $x20 */ ...
079d654
to
707a36c
Compare
Ping. |
llvm/lib/CodeGen/MachineLICM.cpp
Outdated
@@ -585,6 +580,8 @@ void MachineLICMImpl::ProcessMI(MachineInstr *MI, BitVector &RUDefs, | |||
} | |||
|
|||
RUDefs.set(Unit); | |||
if (MO.isImplicit()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What makes implicit defs special here (compared to explicit defs)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped the special case in 4c29db1, thank you!
After investigating HoistRegionPostRA
and ProcessMI
functions for a while, it feels like the description of RUDefs
and RUClobbers
are hardly accurate:
BitVector RUDefs(NumRegUnits); // RUs defined once in the loop.
BitVector RUClobbers(NumRegUnits); // RUs defined more than once.
These two variables are defined in HoistRegionPostRA
and used both in HoistRegionPostRA
and ProcessMI
by set
ting and test
ing bits as well as by mass-set
tint bits in applyBitsNotInRegMaskToRegUnitsMask
.
I have an impression that RUClobbers
is sometimes set while it would be better to first check RUDefs
(and possibly set corresponding bit there instead), such as when handling MO.isRegMask()
case a few lines above - it looks harmless, even though slightly pessimistic, to me.
At the same time, RUDefs
seems to mix two different properties of a register: "the register is live" vs. "we have observed an instruction that defined the register that was initially dead" - while both should transition to "clobbered" state after defining that register one more time, only the latter implies some sort of non-uniformity. Thus, when HoistRegionPostRA
rejects candidate instructions as follows:
MachineInstr *MI = Candidate.MI;
for (const MachineOperand &MO : MI->all_uses()) {
if (!MO.getReg())
continue;
for (MCRegUnit Unit : TRI->regunits(MO.getReg())) {
if (RUDefs.test(Unit) || RUClobbers.test(Unit)) {
// If it's using a non-loop-invariant register, then it's obviously
// not safe to hoist.
Safe = false;
break;
}
}
if (!Safe)
break;
}
it looks like rejecting any instructions having any register inputs given that
// Conservatively treat live-in's as an external def.
// FIXME: That means a reload that're reused in successor block(s) will not
// be LICM'ed.
for (const auto &LI : BB->liveins()) {
for (MCRegUnit Unit : TRI->regunits(LI.PhysReg))
RUDefs.set(Unit);
}
Quite surprisingly, it is not, because on X86 something like this is possible (here $rip
is not a live-in):
renamable $rcx = MOV64rm $rip, 1, $noreg, target-flags(x86-gotpcrel) @x, $noreg, debug-location !22 :: (load (s64) from got); t.ll:5:7
Nevertheless, assuming that the only reason to treat implicit register operands specially is to ignore dead implicit defs for simplicity, it looks harmless to handle any defined register identically, whether it is implicit or explicit.
The test failure looks unrelated:
By the way, the PRs stacked on top of this one passed the tests - maybe |
@@ -554,23 +554,18 @@ void MachineLICMImpl::ProcessMI(MachineInstr *MI, BitVector &RUDefs, | |||
} | |||
|
|||
if (MO.isImplicit()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's unclear to me is why implicit defs are special here as well. I uploaded #147624 which merges the logic for implicit and explicit defs, and it passes the new test from this PR. I also ran a stage2 check-llvm and check-clang and they both pass. It requires updates to some target-specific tests; I'm not an expert on all of the affected ISAs but they seem correct.
Superseded by #147624. |
Take clobbered registers described as implicit-def operands into
account when deciding if an instruction can be moved.
When hasSideEffect was set to 0 in the definition of LOADgotAUTH,
MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started
to crash. The issue was traced down to MachineLICM pass placing
LOADgotAUTH right after an unrelated copy to x16 like rewriting this
code:
like the following: